EVALUATION


The  new  general  mathematical  framework  described  in  the  report  provides 
a basis for extending many existing "digital" decoding techniques into
"multiplicative"  decoding  techniques  which  can  be  applied  directly  to  the 
unquantized  received  word.  Within  the  framework  an  optimal  decoder  was 
formulated — optimal  in  the  sense  that  it  provides  the  minimum  symbol  error 
rate  possible  from  the  received  word.  In  practice,  except  for  short 
codes,  one  almost  always  needs  to  back  off  from  such  a formulation  to  reduce 
complexity. Fortunately, at least in the "additive" domain, extreme
reductions in complexity are often possible which do not significantly impact
performance.  The  preliminary  results  obtained  indicate  a similar  trend 
in  the  "multiplicative"  domain. 

The  significance  of  the  results  and  some  possibilities  for  future 
application  of  the  results  are  indicated  in  Section  5.  Work  during  the 
remainder  of  the  effort  will  focus  upon  obtaining  the  tradeoffs  among 
performance,  decoding  time  and  hardware  complexity  as  indicated  in  Section 
5. 


The  utilization  of  coding  in  various  communication  applications  is 
increasing  as  decoding  complexity  decreases.  The  recent  Troposcatter 
Interleaver Contract F30602-74-C-0133 demonstrated the usefulness of
coding  for  high  speed  tropo  applications.  An  on-going  contractual  effort 
F30602-7b-C-0361, titled "Demod/Decoder Integration," indicates very significant
performance  gains  are  also  attainable  on  high  speed  microwave  line-of-sight 
channels.  The  results  to  date  under  this  effort  provide  a basis  for  extending 
from  hard  decision  decoders  to  decoders  which  utilize  soft  decisions.  While 
additional developmental work is required, the results should be useful in
the  eventual  development  of  powerful  practical  decoders. 


FREDERICK  D.  SCHMANDT 
Project  Engineer 


Section  1 


INTRODUCTION 


This report presents the most recent results of an
investigation into the complexity of decoding error-correcting codes
and the development of efficient and practical decoding
techniques. For earlier results, the reader is referred to
Technical Report RADC-TR-74-297, "Decoding Complexity Study",
November 1974.

A major  objective  of  this  continuing  research  is  to  demon- 
strate that  error-correcting  codes  are  capable  of  providing 
reliable  data  transmission  in  a wide  range  of  applications  at  a 
reasonable  cost.  We  are  convinced  that  the  key  to  widespread 
application  of  coding  lies  in  understanding  and  exploiting  the 
laws  that  govern  the  trade-off  between  code  performance  and 
decoder  complexity. 

It  is  intuitively  clear  that  the  complexity  of  decoding 
increases  ever  more  rapidly  as  the  upper  limit  in  performance  is 
approached.  Because  of  the  steep  slope  of  the  performance- 
complexity  curve  as  it  approaches  the  performance  limit,  we  are 
quite  willing  to  suffer  a small  reduction  in  performance  for 
the  large  reduction  in  complexity  that  should  result.  The  pro- 
blem is  to  make  sure  that  the  full  reduction  in  complexity  paid 
for  by  the  loss  in  performance  is  actually  obtained. 

A logical  approach  to  this  problem  would  be  to  determine 
the optimum performance-complexity trade-off and then devise
techniques which approach or achieve this ideal. Unfortunately,
we  do  not  yet  have  a Shannon-type  result  which  tells  us, 
quantitatively,  just  how  well  we  can  expect  to  do.  However, 
this  does  not  prevent  us  from  invoking  general  principles  to  de- 
duce properties  that  a coding  system  would  have  to  have  in  order 
to  occupy  a position  near  the  theoretical  performance-complexity 
curve.  Consideration  of  such  properties  led  to  the  formulation 
of  the  following  three  heuristics: 

Rule 1: Do not impose any restrictions on a code
        beyond those necessary to obtain a given
        decoding advantage.

Rule 2: Make sure that the decoder fully utilizes
        any restrictions that are placed on the code.

Rule 3: Do not impose any restrictions on the decoder
        beyond those necessary to obtain a given
        decoding advantage.

Application of Rule 1 to the class of finite-geometry codes
resulted in the development of several classes of generalized
finite-geometry codes which achieve significantly improved code
performance for the same decoding complexity. Application
of Rule 2 to traditional majority decoding methods for cyclic
codes resulted in the discovery of a new decoding algorithm
which achieves the same code performance, but with a drastic
reduction in decoder complexity. The research carried out
most recently was suggested by the application of Rule 3 to the
question of whether the performance lost by hard-decision
demodulation is justified by the reduction in decoder
complexity (8-14).

In  a digital  communication  system  with  one  level  of  coding 
(modulation-demodulation), it is natural to design the
demodulator to make hard 0-1 decisions in such a way that the
probability  of  bit  error  is  minimized.  However,  when  a second 
level  of  coding  (error-control  encoding-decoding)  is  added,  this 
demodulation  strategy  is  no  longer  appropriate.  In  a communi- 
cation system  using  two-level  coding,  the  transmitted  bit  stream 
must  satisfy  known  algebraic  constraints.  To  make  hard  0-1 
decisions  without  regard  to  these  constraints  is  to  throw  away 
information  and  degrade  the  performance  of  the  system.  This 
situation  was  tolerated  for  a time  because  it  was  thought  that 
the  loss  in  performance  at  the  output  of  the  demodulator  was 
justified  by  the  simplicity  of  the  digital  decoder  that  followed. 
This  has  come  into  question,  however,  and  there  have  been  many 
proposals  for  reducing  this  performance  loss  through  "soft" 
demodulation  followed  by  an  "extended"  decoder  which  has  been 
modified  to  take  advantage  of  the  additional  information  pro- 
vided by  the  demodulator. 

It  is  natural  to  assume  that  when  soft  demodulation  followed 
by  an  extended  decoder  is  employed,  the  complexity  of  decoding 
will  increase  significantly.  This  would  mean  that  the 
communication  system  designer  would  have  to  choose  between  two 
alternatives: (1) accept the information loss inherent in hard
demodulation but use a powerful code and an efficient digital
algorithm to achieve a net performance gain, or (2) use a weak
code  and/or  an  incomplete  decoding  algorithm,  but  achieve  the 
same  net  gain  by  eliminating  the  information  loss  at  the  output 
of  the  demodulator.  In  the  spirit  of  questioning  all  assump- 
tions about  the  relationship  between  code  performance  and  decod- 
er complexity,  we  applied  heuristic  Rule  3 and  asked  ourselves 
the following question: Does decoder complexity really increase
drastically if we remove the restriction that the decoder be
digital?

In  the  course  of  studying  the  previous  approaches  to  soft- 
decision  decoding,  we  became  aware  of  a curious  fact.  The  two 
best  known  techniques,  correlation  decoding  of  block  codes  and 
Viterbi  decoding  of  convolutional  codes,  although  almost  always 
used  to  decode  linear  codes,  make  no  essential  use  of  the  linear 
property.  This  seemed  to  us  to  be  a violation  of  our  heuristic 
Rule  2,  so  we  concentrated  on  the  question  of  how  the  algebraic 
structure  of  a linear  code  might  be  exploited  in  soft-decision 
decoding.  Posing  the  question  in  this  way  resulted  in  a break- 
through to  a new  area  of  coding  theory  which  we  are  now 
exploring.  We  have  found  that  by  using  a new  representation  of 
finite  fields,  classical  digital  decoding  techniques  can  be 
translated  directly  into  soft-decision  decoding  algorithms. 

This  strongly  supports  the  thesis  that  decoder  complexity  does 
not  increase  drastically  if  we  remove  the  restriction  that  the 
decoder  be  digital.  Furthermore,  the  properties  of  the  new 
algebraic framework allow consideration of new approaches to
decoding which are inapplicable in the classical digital domain.
Within this new mathematical framework, in which finite-field
algebra, combinatorics and the theory of continuous
functions  interact  in  a natural  way,  digital  decoding  and  analog 
demodulation  become  special  cases  of  a more  general  class  of 
decoding-demodulation  functions.  This  means  that  the  tradi- 
tional approach  of  treating  error-control  coding  as  a digital 
add-on  to  the  inherently  analog  modulation-demodulation  channel 
may  now  be  superseded  by  an  integrated  approach  in  which  de- 
modulation-decoding is  viewed  as  a single  unified  signal  pro- 
cessing function.  We  are  sure  that  the  ability  to  integrate  the 
decoding  and  demodulation  functions  will  have  great  impact  on 
the  design  of  future  high-performance  communication  systems. 

In  Section  2,  we  discuss  the  new  general  algebraic  frame- 
work which  provided  the  context  for  most  of  the  work  reported 
herein.  The  major  effort  within  this  context  has  been  the 
development  of  analog  threshold  decoding  algorithms  and  the 
results  of  that  effort  are  reported  in  Section  3.  Section  4 
presents  some  preliminary  results  of  a study  of  parity  check 
set  construction  methods  for  weighted  majority  decoding  of  lin- 
ear block  codes.  Conclusions  and  suggestions  for  further 
research are contained in Section 5.


Section 2


ALGEBRAIC  ANALOG  CODING 

The past twenty-five years have seen the growth of one of
the most elegant and esoteric branches of applied mathematics:
algebraic coding theory. Areas of mathematics previously
considered to be of the utmost purity have been applied to the
problem of constructing error-correcting codes and their
decoding algorithms (to the point where the very concept of
"pure mathematics" has become blurred). Yet in spite of
the impressive theoretical accomplishments, very little
algebraic coding has been put into practice.

We believe that a major reason for this is that communication
system designers tend to view algebraic coding as an overly
fancy  digital  add-on  to  an  inherently  analog  modulation- 
demodulation  system,  and  that  coding  is  more  trouble  than  it  is 
worth.  Anyone  who  has  attempted  to  improve  the  performance  of 
an  existing  communication  system  by  adding  a level  of  error- 
control  coding  can  certainly  sympathize  with  this  feeling.  It 
is  becoming  increasingly  clear  that  the  best  way  to  achieve 
widespread  acceptance  of  algebraic  coding  is  to  integrate  it 
with  the  modulation-demodulation  system  from  the  start. 

That modulation-demodulation and encoding-decoding are
simply two aspects of the overall signal design - signal
processing problem is widely recognized now, and the desirability
of a unified approach is apparent. The modulation-demodulation
and encoding-decoding systems cannot be designed
independently  of  one  another  without  incurring  a performance 
loss.  The  major  problem  occurs  at  the  receiving  end  of  the 
system  when  there  is  a mismatch  between  the  demodulator  and  the 
decoder.  The  solution,  clearly,  is  to  merge  the  demodulation 
and  decoding  functions  and  design  an  optimum  integrated  decoder- 
demodulator.  But  here  we  run  into  an  apples-and-oranges 
mathematical  modelling  problem. 

Consider  the  familiar  situation  in  which  a code  word  of  an 
(n,k)  linear  binary  error-correcting  code  is  transmitted  over  a 
time-discrete  memoryless  channel.  We  may  consider  the  channel 
to  be  a device  which  adds,  as  vectors  of  real  numbers,  an 
error  vector  to  the  modulator's  representation  of  the  code  word. 
The  code  word  was  selected  from  one  algebraic  domain,  the 
n-dimensional  vector  space  over  the  finite  field  GF(2),  and 
the  error  vector  from  another  algebraic  domain,  the  n-dimensional 
vector  space  over  the  real  numbers  R.  An  apple  has  been  added 
to  an  orange.  This  poses  a difficult  problem  at  the  receiver: 

In  what  domain  do  we  process  the  word  received  at  the  output  of 
the  channel? 

One  approach  is  to  force  the  error  vector  into  the  alge- 
braic domain  of  the  code  by  hard-decision  demodulation.  The 
quantized  error-vector  may  then  be  viewed  as  a 0-1  vector  which 
has  been  added,  modulo  2,  to  the  transmitted  0-1  code  word,  and 
all  of  the  techniques  of  finite-field  algebra,  number  theory 
and combinatorics may be employed in the design of the decoder.
We might call this the digital decoding approach. Virtually all
of  classical  algebraic  coding  theory  is  predicated  on  this  model. 
But  as  pointed  out  above,  this  approach  is  unsatisfactory  from  a 
practical  point  of  view  because  of  the  information  loss  at  the 
interface  between  the  hard-decision  demodulator  and  the  digital 
decoder. 
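
As an illustration of the digital decoding model (this sketch is not part of the original report), the following Python fragment assumes that the modulator represents the digits 0 and 1 by the real numbers 0.0 and 1.0 and that the demodulator thresholds at 0.5. The function names and the sample vectors are hypothetical. It shows how a real-valued error vector is forced into the 0-1 domain of the code, with the reliability of each digit discarded along the way.

```python
def hard_decision(received, threshold=0.5):
    """Quantize each real channel output to a 0-1 digit (assumed threshold)."""
    return [1 if x > threshold else 0 for x in received]


def digital_error_vector(codeword, received):
    """Error pattern seen by a digital decoder: hard decisions XOR code word."""
    return [u ^ c for u, c in zip(hard_decision(received), codeword)]


# Illustrative values only: a 0-1 code word plus a real-valued error vector.
codeword = [0, 0, 1, 0, 1, 1, 1]
real_error = [0.6, 0.55, 0.0, 0.0, 0.0, 0.0, 0.0]
received = [c + e for c, e in zip(codeword, real_error)]

print(hard_decision(received))
print(digital_error_vector(codeword, received))
```

After quantization the decoder can no longer tell that the first two digits were only marginally on the wrong side of the threshold, which is the information loss referred to above.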

An  alternative  approach,  which  we  might  call  probabilistic 
decoding,  is  to  treat  the  code  word  as  if  it  came  from  the 
algebraic  domain  of  the  error  vector.  In  this  case,  the  algebra- 
ic properties  of  the  code  (linearity,  number-theoretic  propert- 
ies, etc.)  are  simply  ignored.  The  signal  processing  is  done 
entirely  in  the  error  vector  domain.  Two  well-known  examples  of 
this  approach  are  correlation  decoding  of  block  codes  and 
Viterbi  decoding of  convolutional  codes.  Both  methods  are 
normally  used  to  decode  linear  codes,  but  neither  method  makes 
any  essential  use  of  the  linear  property.  This  approach  is 
satisfactory  only  for  low  rate  or  short  codes. 
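
A minimal sketch of the probabilistic approach, assuming 0/1 signalling over the reals so that correlation decoding amounts to choosing the code word nearest to the received vector in Euclidean distance. The generator rows below describe a small (7,3) binary code chosen only for illustration, and all names are hypothetical. Note that the search treats the code as an unstructured list of words; linearity is used only to enumerate that list.

```python
from itertools import product


def all_codewords(generator):
    """Enumerate the 2^k code words spanned by the generator rows over GF(2)."""
    n = len(generator[0])
    words = []
    for coeffs in product([0, 1], repeat=len(generator)):
        word = [0] * n
        for a, row in zip(coeffs, generator):
            if a:
                word = [w ^ g for w, g in zip(word, row)]
        words.append(tuple(word))
    return words


def correlation_decode(received, codewords):
    """Return the code word nearest to the received vector (exhaustive search)."""
    def squared_distance(word):
        return sum((r - c) ** 2 for r, c in zip(received, word))
    return min(codewords, key=squared_distance)


# Illustrative (7,3) code; the rows are assumptions, not taken from the report.
G = [
    [0, 0, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 1, 1],
    [1, 1, 0, 0, 1, 0, 1],
]
codewords = all_codewords(G)
received = [0.6, 0.55, 1.0, 0.0, 1.0, 1.0, 1.0]
print(correlation_decode(received, codewords))
```

The cost of the search grows with the number of code words, 2^k, which is why this style of decoding is attractive only for short or low-rate codes.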

A third approach is to attempt to exploit both algebraic
domains by combining digital and probabilistic decoding.
Examples of such hybrid decoding schemes are: Wagner decoding (18),
generalized-minimum-distance decoding (19), weighted-erasure
decoding, and decoding with channel-measurement information.
Although  these  schemes  show  improvement  over  strictly  digital  or 
strictly  probabilistic  decoding  in  many  instances,  one  gets  the 
impression  that  the  apples-and-oranges  problem  remains  unresolved. 

As a result of surveying the existing decoding techniques,
it occurred to us to ask whether anything could be done about
the apparent incompatibility between these two algebraic
domains. Clearly, nothing can be done about Mother Nature's error
vector  domain,  but  what  about  the  man-made  algebraic  domain  of 
the  code?  This  line  of  inquiry  led  to  the  discovery  of  a more 
general  algebraic  domain  in  which  the  analog  error  vector  and 
the  digital  code  word  co-exist  in  a natural  way.  This  is 
achieved  through  the  use  of  a new  representation  of  finite 
fields  which  we  will  now  describe.  For  simplicity,  we  restrict 
our  discussion  to  fields  of  prime  order.  The  extension  to  fields 
of  prime-power  order  is  straightforward. 

The finite field of p elements, GF(p), is usually represented
by the ring of integers mod p. We will call this the
"additive representation" of GF(p) and denote it by
$S = \langle S, \oplus, \odot \rangle$ where $S = \{0, 1, \ldots, p-1\}$, and
"$\oplus$" and "$\odot$" are modulo p addition and multiplication. The new
representation of GF(p), which we call the "multiplicative
representation", will be denoted by
$S' = \langle S', \cdot, * \rangle$ where
$S' = \{1, \alpha, \alpha^2, \ldots, \alpha^{p-1}\}$ is the set of complex
$p$th roots of unity, "$\cdot$" is ordinary multiplication of complex
numbers, and "$*$" is a new operation defined by

$$u * v = v^{\log_\alpha u}$$

where the principal value of the logarithm is taken. To show
that S' is indeed a representation of GF(p), it is necessary
only to establish the existence of an isomorphism from S to S'.
Thus let f be any function from S to the complex numbers C such
that for all $i \in S$, $f(i) = \alpha^i$. It is easy to verify that f is
such an isomorphism.
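
The isomorphism can be checked numerically. The sketch below is an illustration rather than part of the original report: it assumes $\alpha = e^{2\pi i/p}$, reads the logarithm in the definition of "*" as the principal logarithm to the base $\alpha$, and uses hypothetical function names. It verifies that f carries addition mod p into complex multiplication and multiplication mod p into the "*" operation, and then applies "*" to arguments that are not roots of unity.

```python
import cmath


def multiplicative_field(p):
    """Return (alpha, f, star) for the multiplicative representation of GF(p)."""
    alpha = cmath.exp(2j * cmath.pi / p)   # assumed primitive pth root of unity
    log_alpha = cmath.log(alpha)           # principal value

    def f(i):
        """Map the integer i of the additive representation to alpha**i."""
        return alpha ** i

    def star(u, v):
        """u * v = v ** log_alpha(u), using principal logarithms throughout."""
        return v ** (cmath.log(u) / log_alpha)

    return alpha, f, star


p = 5
alpha, f, star = multiplicative_field(p)

for a in range(p):
    for b in range(p):
        # Addition mod p corresponds to ordinary complex multiplication.
        assert abs(f((a + b) % p) - f(a) * f(b)) < 1e-9
        # Multiplication mod p corresponds to the "*" operation.
        assert abs(f((a * b) % p) - star(f(a), f(b))) < 1e-9

# The same operation is still defined for nonzero complex numbers off the
# unit circle, i.e. for "non-digital" arguments.
print(star(0.8 * f(2), 1.1 * f(3)))
```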


The important point is that the operations "$\cdot$" and "$*$" of
the multiplicative representation of GF(p) are defined for all
nonzero elements of C, not just the $p$th roots of unity.† We
have thus constructed a general algebraic system $\langle C, +, \cdot, * \rangle$
which contains the multiplicative representation of GF(p).
Every algebraic equation that can be written in the classical
S-domain can be translated directly into an equivalent equation
in the S'-domain. But once in the S'-domain, the algebraic
equation extends immediately to non-digital arguments (i.e.,
arguments which are not restricted to be $p$th roots of unity).

† There is a technical difficulty with the definition of $Z^W$
when we allow Z to be complex if we require that all of the
usual laws of exponents hold (which in our development we do
not). For a discussion of this point, see A. M. Gleason,
"Fundamentals of Abstract Analysis," pp. 324-326,
Addison-Wesley, 1966.

This will be discussed in the next section, but we can give
a simple explanation here. In conventional digital decoding of a
linear (n,k) code, a parity check is defined by

$$s_i = \sum_{j=1}^{n} h_{ij} r_j \pmod{p}$$

where $(r_1, \ldots, r_n)$ is the received word and $(h_{i1}, \ldots, h_{in})$ is a
word in the dual code. It is assumed here that $r_j \in S$, for, if
not, the algebraic operations are not defined. The corresponding
equation in the S' domain is

$$s_i' = \prod_{j=1}^{n} h_{ij}' * r_j' = \prod_{j=1}^{n} (r_j')^{h_{ij}}$$

(where $h_{ij}' = \alpha^{h_{ij}}$, etc.). If the $r_j'$ are $p$th roots of unity,
then there is nothing new. But the $r_j'$ need not be so restricted.
The algebraic operations are still defined when the $r_j'$ are any
nonzero complex numbers. This means that any conventional
"additive"  digital  decoder  can  be  translated  into  a correspond- 
ing "multiplicative"  decoder  and  then  applied  directly  to  the 
raw,  unquantized  received  word.  The  question,  of  course,  is: 
how  well  does  this  work?  As  will  be  seen  in  the  next  section, 
it  can  work  surprisingly  well,  but  we  know  far  too  little  at 
this  time  to  make  any  more  specific  statements.  We  are  in  the 
somewhat  embarrassing  position  of  having  a variety  of  new 
demodulation-decoding  techniques  which  work  remarkably  well, 
but  which  we  understand  only  marginally.  We  are  currently 
investigating  such  problems  as  interpreting  the  meaning  of  the 
syndrome  when  the  digital  restriction  is  removed,  and  determin- 
ing how  many  "multiplicative"  parity  checks  are  required  to 
specify  a code  in  this  more  general  domain.  We  are  studying 
the  interaction  of  probabilistic,  algebraic  and  combinatorial 
mechanisms  in  an  effort  to  find  the  proper  viewpoint  from  which 
to  make  sense  of  it  all.  To  date,  the  best  insights  have  come 
through  the  use  of  abstract  harmonic  analysis  (group  characters, 
finite  Fourier  transforms,  etc.).  At  this  point  we  know  that 
all  of  classical  algebraic  coding  theory  can  be  translated 
from  the  additive  domain  into  the  multiplicative  domain,  and 
that once in this new domain a bewildering number of analog
processing extensions become possible. The extension which we
are  exploring  currently  is  discussed  in  Section  3. 
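
The translation of a parity check from the additive to the multiplicative domain can be sketched in a few lines of Python. This is an illustration only: the check vector `h`, the word `r`, the choice p = 3 and the function names are hypothetical, and the multiplicative check is written as a product of received symbols raised to the integer check coefficients. For digital inputs the two checks agree under the map $i \to \alpha^i$; the multiplicative form also accepts unquantized complex inputs.

```python
import cmath


def additive_check(h, r, p):
    """Conventional parity check: sum of h_j * r_j, reduced mod p."""
    return sum(hj * rj for hj, rj in zip(h, r)) % p


def multiplicative_check(h, r_prime):
    """Multiplicative parity check: product of r'_j raised to the power h_j."""
    result = 1 + 0j
    for hj, rj in zip(h, r_prime):
        result *= rj ** hj
    return result


p = 3
alpha = cmath.exp(2j * cmath.pi / p)   # assumed primitive cube root of unity
h = [1, 2, 0, 1]                       # hypothetical check coefficients
r = [2, 1, 1, 0]                       # hypothetical digital received word

s = additive_check(h, r, p)
s_prime = multiplicative_check(h, [alpha ** x for x in r])
assert abs(s_prime - alpha ** s) < 1e-9   # same check, two representations

# Inputs that are not roots of unity are handled by the same expression.
noisy = [0.9 * alpha ** 2, 1.2 * alpha, alpha * cmath.exp(0.3j), 1.1 + 0j]
print(multiplicative_check(h, noisy))
```

How the value of such a check should be interpreted once the inputs leave the unit circle is exactly the open question raised in the text; the sketch only shows that the expression remains computable.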


Section 3


ANALOG  THRESHOLD  DECODING 


3.1 Introduction

Majority logic decoding and the more general threshold
decoding constitute widely studied areas of algebraic coding
theory (7, 22-40). Majority decoding usually takes the form of
a symbol-by-symbol decoding algorithm for linear block or
convolutional codes. Most majority algorithms make strong use
of both the linearity of the code and any special combinatorial
structure the code may have. Because of the principal
investigators' familiarity with the area, majority decoding was the
first technique to be translated into the multiplicative domain
and extended to analog processing of the unquantized received
word. (Actually, the discovery of the multiplicative extension
of majority decoding predated the discovery of the general
multiplicative algebraic domain.) Following Massey (29), we
call this extended decoding method "analog threshold decoding".

In the previous section, we pointed out that any function f
which, for all $i \in S$, maps $i \to \alpha^i$ is an isomorphism from S, the
additive representation of GF(p), to S', the multiplicative
representation of GF(p). One way to convert an additive digital
decoder to a multiplicative analog decoder is to physically
implement the function f and apply it to the output of the
channel. Our first experiment along these lines involved a
majority decoder for a linear binary block code transmitted
over a time-discrete memoryless channel. The initial choice
for the isomorphism f was $f(x) = \cos \pi x$. (In the binary case,
$\alpha = -1$, $S = \{0,1\}$ and $S' = \{1,-1\}$.)

In conventional one-step majority decoding, the unquantized
received word $(r_1, \ldots, r_n)$ is converted to a 0-1 vector
$(u_1, \ldots, u_n)$ by hard-decision demodulation, various modulo 2
parity check sums involving the $\{u_j\}$ are computed and an estimate
of $c_i$, the $i$th transmitted code digit, is obtained by majority
decision on these sums. In analog threshold decoding, the
received word $(r_1, \ldots, r_n)$ is converted to a real-valued vector
$(\cos \pi r_1, \ldots, \cos \pi r_n)$, the corresponding parity check products
involving the $\{\cos \pi r_j\}$ are computed, and the estimate of $c_i$ is
obtained by thresholding on the sum of these products.
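
The two procedures can be compared side by side on a small example. The sketch below is an illustration, not the decoder used in the study. It assumes a cyclic (7,3) binary code with three parity checks orthogonal on each position (positions i, i+1, i+3 and their cyclic shifts), 0/1 signalling over the reals, and a sum of products that includes the received digit itself as one term. All names and the tie-breaking rule in the majority decoder are assumptions.

```python
import math

N = 7
# Pairs (a, b) such that digits 0, a, b sum to zero mod 2 in every code word
# of the assumed (7,3) code; the checks on other positions are cyclic shifts.
CHECKS_ON_0 = [(1, 3), (4, 5), (2, 6)]


def checks_on(i):
    """Orthogonal checks on position i, obtained by cyclic shifting."""
    return [((a + i) % N, (b + i) % N) for a, b in CHECKS_ON_0]


def majority_decode(received):
    """Hard-decision one-step majority decoding (ties resolved as 0)."""
    u = [1 if x > 0.5 else 0 for x in received]
    decoded = []
    for i in range(N):
        estimates = [u[i]] + [u[a] ^ u[b] for a, b in checks_on(i)]
        decoded.append(1 if sum(estimates) > len(estimates) / 2 else 0)
    return decoded


def analog_threshold_decode(received):
    """Threshold on the sum of parity check products of cos(pi * r_j)."""
    rho = [math.cos(math.pi * x) for x in received]
    decoded = []
    for i in range(N):
        total = rho[i] + sum(rho[a] * rho[b] for a, b in checks_on(i))
        decoded.append(0 if total > 0 else 1)
    return decoded


codeword = [0, 0, 1, 0, 1, 1, 1]
received = [0.6, 0.55, 1.0, 0.0, 1.0, 1.0, 1.0]
print(majority_decode(received))
print(analog_threshold_decode(received))
```

In this particular example the hard decisions contain two digit errors, which exceeds what the majority decoder can correct, while the analog sum still recovers the transmitted word. A single example of this kind is only suggestive and says nothing about error rates.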

The reader might well ask at this point why anyone would
choose the function $\cos \pi x$ to be the isomorphism from S to S'.
A periodic "soft-decision" demodulation function hardly makes
sense from a communication system designer's point of view.
The reasons for this choice are of historical interest only and
it is certainly true that $\cos \pi x$ would never be used in practice.
However, it is also true that $\cos \pi x$ works surprisingly well, and
that it is a convenient function to work with from a
theoretician's point of view (i.e., theorems can be proved).

In order to talk about the performance of an analog
threshold decoder, we have to define error-correcting
capability over the real numbers R. The natural distance
measure over $R^n$ is the Euclidean metric, and it is easily
verified that a binary (n,k) code with minimum Hamming
distance $d_H$ has minimum Euclidean distance $d_E = \sqrt{d_H}$. We say
that a decoding function is a nearest-neighbor decoding rule
if it maps a received vector onto a nearest word in the code,
and a radius-r decoding rule if it maps a received vector onto
a nearest word in the code whenever the vector is within
Euclidean distance r of a code word. The maximum radius possible
without having overlapping spheres is $r = \sqrt{d_H}/2$, and a decoding
function which achieves this radius is called a maximum-radius
decoding rule. (Note the obvious analogy with t-error-correction
in digital decoding.)
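
The relation between the two distance measures is easy to confirm by enumeration. The following sketch assumes that the digits 0 and 1 are represented by the real numbers 0 and 1, uses an illustrative (7,3) generator matrix that is not taken from the report, and relies on `math.dist` (Python 3.8 or later). It computes the minimum Hamming distance, the minimum Euclidean distance and the resulting maximum decoding radius.

```python
import math
from itertools import product


def all_codewords(generator):
    """Enumerate the code words spanned by the generator rows over GF(2)."""
    n = len(generator[0])
    words = []
    for coeffs in product([0, 1], repeat=len(generator)):
        word = [0] * n
        for a, row in zip(coeffs, generator):
            if a:
                word = [w ^ g for w, g in zip(word, row)]
        words.append(tuple(word))
    return words


# Illustrative (7,3) code (an assumption made for this example).
G = [
    [0, 0, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 1, 1],
    [1, 1, 0, 0, 1, 0, 1],
]
words = all_codewords(G)

# For a linear code the minimum distance equals the minimum nonzero weight.
d_hamming = min(sum(w) for w in words if any(w))
# Minimum Euclidean distance between distinct words, viewed as real vectors.
d_euclid = min(math.dist(a, b) for a in words for b in words if a != b)
max_radius = d_euclid / 2

print(d_hamming, d_euclid, max_radius)
```

Two 0-1 vectors that differ in d positions are at Euclidean distance sqrt(d), since each differing coordinate contributes exactly 1 to the squared distance.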

Using the demodulation function $\cos \pi x$, we were able to prove
the following results for linear binary block codes (8). First,
a one-step orthogonalizable code of minimum Hamming distance
$d_H$ can be maximum-likelihood decoded by a one-step analog
threshold decoder using at most $d_H$ parity check products.
Second, a Hamming code of length $n = 2^m - 1$ can be maximum-radius
decoded using $2^{m-1}$ products. Finally, any L-step
orthogonalizable code with minimum Hamming distance $d_H$ can be
maximum-radius decoded by a sequential code reduction decoder (4) whose
first stage is an analog threshold decoder using $d_H$ products
and whose remaining stages are digital, provided that the
subcodes used in decoding are all capable of correcting $d_H - 1$ or
fewer digital errors (which is almost always the case). We
also showed that maximum-radius decoding - however achieved -
is asymptotically optimum for the white Gaussian channel. More
recently, we have been able to extend our earlier results and
show that whenever it is possible to find a set of R parity
checks which all check the $i$th position, but no more than $\lambda$
of which check any other position, then a one-step analog
threshold decoder will do radius-r decoding with
r = [expression illegible in the source scan].
(For orthogonalizable codes, $R = d_H - 1$ and $\lambda = 1$.)

Since $f(x) = \cos \pi x$ was clearly not an optimum choice, we
experimented with other functions in an effort to understand what
makes a good demodulation function. We ran computer simulations
for the white Gaussian channel using a variety of functions -
including $\cos \pi x$ - with inconclusive results. The analog threshold
decoders consistently outperformed the corresponding digital
majority decoders, but a particular function would be better for
one code than for another, or would perform better at one signal-
to-noise ratio (SNR) than at another. It was not until we began
to consider the possibility of adaptive analog threshold
decoding that the optimum function was found.

In the case of a linear binary (n,k) code, the optimum
soft-decision function (optimum in the sense that the probability
of bit error is minimized over any time-discrete memoryless
channel when the code words are equiprobable) is

$$f(x) = \frac{1 - \phi(x)}{1 + \phi(x)}$$

where $\phi(x) = \Pr(x \mid 1)/\Pr(x \mid 0)$ is the likelihood ratio. This
function is optimum when (1) all $2^{n-k}$ parity check products are
used and (2) the products are weighted equally. We have been
able to generalize this result to any linear block or
convolutional code over GF(q) (13).
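
A sketch of how the optimum rule might be exercised on a toy example follows. It is not the implementation used in the study. It assumes 0/1 signalling in white Gaussian noise of standard deviation `sigma`, for which the likelihood ratio is $\phi(x) = \exp((2x-1)/2\sigma^2)$ and the soft-decision function reduces to $-\tanh((2x-1)/4\sigma^2)$. It further assumes that the parity check product for dual code word $c'$ and position m multiplies the soft values of those positions l where $c'_l$ and $\delta_{lm}$ differ, which is the form of the rule in the published literature as we understand it. The code, its dual and all names are illustrative. The result is compared with a brute-force symbol-by-symbol decision over all code words.

```python
import math
from itertools import product


def span(rows):
    """All GF(2) linear combinations of the given rows."""
    n = len(rows[0])
    words = []
    for coeffs in product([0, 1], repeat=len(rows)):
        word = [0] * n
        for a, row in zip(coeffs, rows):
            if a:
                word = [w ^ g for w, g in zip(word, row)]
        words.append(tuple(word))
    return words


def soft_value(x, sigma):
    """(1 - phi)/(1 + phi) for the assumed Gaussian channel, in tanh form."""
    return -math.tanh((2.0 * x - 1.0) / (4.0 * sigma ** 2))


def dual_domain_decode(received, dual_words, sigma):
    """Threshold the equally weighted sum of all parity check products."""
    rho = [soft_value(x, sigma) for x in received]
    decoded = []
    for m in range(len(received)):
        total = 0.0
        for word in dual_words:
            term = 1.0
            for l, bit in enumerate(word):
                if bit ^ (l == m):      # exponent c'_l XOR delta_lm
                    term *= rho[l]
            total += term
        decoded.append(0 if total > 0 else 1)
    return decoded


def brute_force_decode(received, code_words, sigma):
    """Symbol-by-symbol decision by summing likelihoods over all code words."""
    def likelihood(word):
        d2 = sum((r - c) ** 2 for r, c in zip(received, word))
        return math.exp(-d2 / (2.0 * sigma ** 2))
    decoded = []
    for m in range(len(received)):
        p0 = sum(likelihood(w) for w in code_words if w[m] == 0)
        p1 = sum(likelihood(w) for w in code_words if w[m] == 1)
        decoded.append(0 if p0 > p1 else 1)
    return decoded


# Illustrative (7,3) code and generators of its (7,4) dual (assumptions).
G = [[0, 0, 1, 0, 1, 1, 1], [1, 0, 0, 1, 0, 1, 1], [1, 1, 0, 0, 1, 0, 1]]
H = [[1, 1, 0, 1, 0, 0, 0], [0, 1, 1, 0, 1, 0, 0],
     [0, 0, 1, 1, 0, 1, 0], [0, 0, 0, 1, 1, 0, 1]]
code_words, dual_words = span(G), span(H)

received = [0.6, 0.55, 1.0, 0.0, 1.0, 1.0, 1.0]
print(dual_domain_decode(received, dual_words, sigma=0.5))
print(brute_force_decode(received, code_words, sigma=0.5))
```

The dual-domain sum runs over the 2^(n-k) words of the dual code, whereas the brute-force sum runs over the 2^k code words. Passing only a subset of `dual_words` gives a truncated sum; no performance claim is made here for such subsets.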

The reader may have noticed that the function f(x) as given
is not an isomorphism from S to S'. We could easily make it so
by normalizing, but then the weights assigned to the parity
products would become functions of the SNR. When the function as
given is used, the contributions of the various parity products
are automatically scaled according to their reliability at the
SNR on the channel. At high SNR, the $2^{n-k}$ parity products
contribute more-or-less equally, while at low SNR the only
significant contributions are from parity products which
correspond to minimum weight words in the dual code. It is interesting
to note that for the white Gaussian channel, the optimum function
approaches a [-1,+1] step function as SNR $\to \infty$. Analog threshold
decoding would be mathematically equivalent to digital majority
decoding if the step function were actually used. This
illustrates very nicely the fact that digital decoding is a
special limiting case of this more general class of
decoding-demodulation functions.
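
The limiting behaviour can be seen numerically. The sketch below assumes 0/1 signalling in white Gaussian noise with standard deviation `sigma`, so that the likelihood ratio is $\exp((2x-1)/2\sigma^2)$; the function names and sample points are illustrative. As `sigma` decreases (SNR increases) the soft-decision function sharpens toward a step from +1 to -1 at x = 0.5, while at low SNR the noiseless inputs 0 and 1 are mapped well inside the interval, which is why the function as given is not an isomorphism from S to S'.

```python
import math


def likelihood_ratio(x, sigma):
    """phi(x) = Pr(x|1)/Pr(x|0) for levels 0 and 1 in Gaussian noise (assumed)."""
    return math.exp((2.0 * x - 1.0) / (2.0 * sigma ** 2))


def soft_decision(x, sigma):
    """Soft-decision function (1 - phi)/(1 + phi).

    For very small sigma the exponential can overflow; the equivalent
    form -tanh((2x - 1) / (4 sigma^2)) avoids that.
    """
    phi = likelihood_ratio(x, sigma)
    return (1.0 - phi) / (1.0 + phi)


for sigma in (1.0, 0.3, 0.1):
    values = [round(soft_decision(x, sigma), 3) for x in (0.0, 0.4, 0.6, 1.0)]
    print(sigma, values)
```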

The discovery of the optimum soft-decision symbol-by-symbol
analog  threshold  decoding  algorithm  is  significant  because  the 
complexity  of  the  algorithm  varies  with  the  size  of  the  dual 
code  and  is  thus  inversely  related  to  code  rate.  This  decoding 
method  therefore  is  to  high-rate  codes  what  correlation  and 
Viterbi  decoding  are  to  low  rate  codes,  which  fills  an  important 
gap in the arsenal of decoding techniques. But even more
significant, perhaps, is the concept of soft-decision decoding
in the dual code domain itself. In classical code domain
decoding, there is no graceful way to give up a small amount of
performance in order to reduce the complexity of the decoder.
For example, one cannot discard half of the matched filters in a
correlation  receiver,  or  half  of  the  microcomputers  in  a Viterbi 
decoder.  The  effect  on  performance  would  be  disastrous.  This 
is  not  so  in  the  case  of  dual  code  domain  decoding.  If  we  were 
to  throw  away,  at  random,  half  of  the  parity  check  products  in 
an  optimum  analog  threshold  decoder,  we  would  not  expect  a 
significant  loss  in  performance.  The  reason  for  this  is  that 
the  dual  code  domain  expansion  of  the  decoding  function  is 
essentially  a Fourier  series,  and  even  a fairly  severe  truncation 
of the series should result in no more than a small overall
degradation of performance, the loss being independent of the code
word transmitted. To support this view, we cite the results of
some very recent simulations carried out by CNR, Inc. (41) for
the (21,11) code on the white Gaussian channel. Reducing the
number of parity products from 1024 to 6 resulted in a loss of
less  than  1 db  at  the  bit  error  rate  of  2 * 10“^. 

Much  remains  to  be  done  in  this  area,  particularly  on  the 
problem  of  suboptimum  analog  threshold  decoding.  We  still  do 
not  know  what  the  optimum  demodulation  function  is  when  a 
proper  subset  of  the  available  parity  checks  is  used,  or  even  if 
the  optimum  function  can  be  factored  into  an  adaptive  part, 
which  is  a function  of  the  SNR,  and  a fixed  part  which  is  a 
function of the set of parity check products to be used. However,
the preliminary findings are certainly encouraging and we
expect that this line of investigation will continue to produce
results of theoretical and practical importance.

We now present, in detail, the new optimum symbol-by-symbol
decoding rule for linear codes.

3.2 The Optimum Decoding Rule

For convenience, we present the decoding rule for linear
block codes. The extension to convolutional codes is immediate
and will be obvious from the examples.

Let $\mathbf{c} = (c_0, c_1, \ldots, c_{n-1})$ denote any code word of an (n,k)
linear block code C over GF(p) and $\mathbf{c}'_j = (c'_{j0}, c'_{j1}, \ldots, c'_{j,n-1})$ the
jth code word of the (n,n-k) dual code C'. A code word $\mathbf{c}$ is
transmitted over a time-discrete memoryless channel with output
alphabet B. The received word is denoted by $\mathbf{r} = (r_0, r_1, \ldots, r_{n-1})$,
$r_l \in B$. The decoding problem is: given $\mathbf{r}$, compute an estimate
$\hat{c}_m$ of the transmitted code symbol $c_m$ in such a way that the
probability that $\hat{c}_m$ equals $c_m$ is maximized. Other notation:
$\omega = \exp[2\pi\sqrt{-1}/p]$ (primitive complex pth root of unity);
$\delta_{ij} = 1$ if i = j and 0 otherwise; Pr(x) is the probability of x
and Pr(x|y) is the probability of x given y. Unless otherwise
stated, the elements of GF(p) are taken to be the integers
0,1,...,p-1 and all arithmetic operations are performed in the
field of complex numbers.
DECODING RULE:

Set $\hat{c}_m = s$, where $s \in GF(p)$ maximizes the expression

$$A_m(s) = \sum_{t=0}^{p-1} \omega^{-st} \sum_{j=1}^{p^{n-k}} \prod_{l=0}^{n-1} \sum_{i=0}^{p-1} \omega^{-i(c'_{jl} - t\delta_{ml})} \Pr(r_l \mid i). \qquad (1)$$

Theorem: Decoding Rule (1) maximizes the probability that
$\hat{c}_m = c_m$.


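(Proof) We must show that choosing s to maximize $A_m(s)$ is
equivalent to maximizing the probability that $c_m$ equals s given
the received word $\mathbf{r}$. We do this directly by showing that
$\Pr(c_m = s \mid \mathbf{r}) = \lambda A_m(s)$, where $\lambda$ is a positive constant which is
independent of s. We first note that

$$\Pr(c_m = s \mid \mathbf{r}) = \sum_{\mathbf{c} \in C,\, c_m = s} \Pr(\mathbf{c} \mid \mathbf{r}) = \sum_{\mathbf{c} \in C,\, c_m = s} \Pr(\mathbf{r} \mid \mathbf{c}) \, [\Pr(\mathbf{c}) / \Pr(\mathbf{r})]. \qquad (2)$$

Since the code words of C are equiprobable, $\Pr(\mathbf{c}) = p^{-k}$ and
(2) becomes

$$\Pr(c_m = s \mid \mathbf{r}) = [p^{-k} / \Pr(\mathbf{r})] \sum_{\mathbf{c} \in C} \Pr(\mathbf{r} \mid \mathbf{c}) \, \delta_{s,\, \mathbf{c} \cdot \mathbf{e}_m}, \qquad (3)$$

where $\mathbf{e}_m = (\delta_{m0}, \delta_{m1}, \ldots, \delta_{m,n-1})$ is the vector with 1 in
the mth position and 0 elsewhere. In terms of their finite
Fourier transforms,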
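$$\delta_{s,\, \mathbf{c} \cdot \mathbf{e}_m} = p^{-1} \sum_{t=0}^{p-1} \omega^{t(\mathbf{c} \cdot \mathbf{e}_m - s)} \qquad (4)$$

$$\Pr(\mathbf{r} \mid \mathbf{c}) = p^{-n} \sum_{\mathbf{u} \in V_n} F(\mathbf{r}, \mathbf{u}) \, \omega^{\mathbf{u} \cdot \mathbf{c}} \qquad (5)$$

where

$$F(\mathbf{r}, \mathbf{u}) = \sum_{\mathbf{v} \in V_n} \Pr(\mathbf{r} \mid \mathbf{v}) \, \omega^{-\mathbf{u} \cdot \mathbf{v}}, \qquad (6)$$

$\mathbf{u} = (u_0, u_1, \ldots, u_{n-1})$ and $\mathbf{v} = (v_0, v_1, \ldots, v_{n-1})$ are any elements of
$V_n$, the vector space of all n-tuples over GF(p). Substituting
(4) and (5) in (3) yields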


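$$\Pr(c_m = s \mid \mathbf{r}) = [p^{-n-k-1} / \Pr(\mathbf{r})] \sum_{\mathbf{c} \in C} \sum_{\mathbf{u} \in V_n} F(\mathbf{r}, \mathbf{u}) \, \omega^{\mathbf{u} \cdot \mathbf{c}} \sum_{t=0}^{p-1} \omega^{t(\mathbf{c} \cdot \mathbf{e}_m - s)}$$

$$= [p^{-n-k-1} / \Pr(\mathbf{r})] \sum_{t=0}^{p-1} \omega^{-st} \sum_{\mathbf{u} \in V_n} F(\mathbf{r}, \mathbf{u}) \sum_{\mathbf{c} \in C} \omega^{(\mathbf{u} + t\mathbf{e}_m) \cdot \mathbf{c}}. \qquad (7)$$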


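By the orthogonality properties of group characters, we know
that

$$\sum_{\mathbf{c} \in C} \omega^{\mathbf{v} \cdot \mathbf{c}} = \begin{cases} p^{k} & \text{if } \mathbf{v} \in C' \\ 0 & \text{otherwise.} \end{cases} \qquad (8)$$

Applying (8) to (7) gives

$$\Pr(c_m = s \mid \mathbf{r}) = [p^{-n-1} / \Pr(\mathbf{r})] \sum_{t=0}^{p-1} \omega^{-st} \sum_{j=1}^{p^{n-k}} F(\mathbf{r}, \mathbf{c}'_j - t\mathbf{e}_m). \qquad (9)$$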


Since the channel is memoryless, we may write (6) as

$$F(\mathbf{r}, \mathbf{u}) = \sum_{\mathbf{v} \in V_n} \prod_{l=0}^{n-1} \Pr(r_l \mid v_l) \, \omega^{-u_l v_l} = \prod_{l=0}^{n-1} \sum_{i=0}^{p-1} \Pr(r_l \mid i) \, \omega^{-i u_l}. \qquad (10)$$

Substituting (10) in (9) yields

$$\Pr(c_m = s \mid \mathbf{r}) = [p^{-n-1} / \Pr(\mathbf{r})] \sum_{t=0}^{p-1} \omega^{-st} \sum_{j=1}^{p^{n-k}} \prod_{l=0}^{n-1} \sum_{i=0}^{p-1} \omega^{-i(c'_{jl} - t\delta_{ml})} \Pr(r_l \mid i) = [p^{-n-1} / \Pr(\mathbf{r})] \, A_m(s).$$

Q.E.D.
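The identity established in the proof can be checked numerically on a small code. The following is only a sketch under stated assumptions: the toy (3,2) code over GF(3), the likelihood values, and the function names `dual_rule_metric` and `brute_force_metric` are illustrative choices, not taken from the report. Combining (3) and the last line of the proof, $A_m(s)$ should equal $p^{n+1-k}$ times the sum of $\Pr(\mathbf{r} \mid \mathbf{c})$ over the code words with $c_m = s$.

```python
import cmath
from itertools import product


def dual_rule_metric(dual_code, likelihoods, m, s, p):
    """Evaluate A_m(s) of Decoding Rule (1) from the dual code words.

    dual_code   : list of dual code words c'_j (tuples over GF(p))
    likelihoods : likelihoods[l][i] = Pr(r_l | i)
    """
    omega = cmath.exp(2j * cmath.pi / p)  # primitive complex p-th root of unity
    total = 0j
    for t in range(p):
        inner = 0j
        for cj in dual_code:
            prod = 1 + 0j
            for l, probs in enumerate(likelihoods):
                u = (cj[l] - (t if l == m else 0)) % p  # c'_jl - t*delta_ml
                prod *= sum(omega ** ((-i * u) % p) * probs[i] for i in range(p))
            inner += prod
        total += omega ** ((-s * t) % p) * inner
    return total


def brute_force_metric(code, likelihoods, m, s):
    """Sum of Pr(r | c) over all code words c with c_m = s."""
    total = 0.0
    for c in code:
        if c[m] == s:
            term = 1.0
            for l, symbol in enumerate(c):
                term *= likelihoods[l][symbol]
            total += term
    return total


# Toy (3,2) code over GF(3): all words whose symbols sum to 0 mod 3.
P, N, K = 3, 3, 2
CODE = [c for c in product(range(P), repeat=N) if sum(c) % P == 0]
DUAL = [(a, a, a) for a in range(P)]  # row space of the single check (1,1,1)

# Hypothetical per-symbol likelihoods Pr(r_l | i), chosen arbitrarily.
LIKELIHOODS = [(0.7, 0.2, 0.1), (0.1, 0.6, 0.3), (0.3, 0.3, 0.4)]
```

Note that the dual-domain sum runs over only $p^{n-k} = 3$ dual code words here, while the direct sum runs over $p^{k} = 9$ code words, which is the high-rate advantage the text describes.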


As one might expect, the decoding rule takes a comparatively
simple form in the binary case: set $\hat{c}_m = 0$ if
$A_m(0) > A_m(1)$ and $\hat{c}_m = 1$ otherwise. It is more convenient
however to state the rule in terms of the likelihood ratio

$$\phi_m = \Pr(r_m \mid 1) / \Pr(r_m \mid 0).$$

Substituting (1) into the inequality $A_m(0) > A_m(1)$ yields

$$\sum_{t=0}^{1} \sum_{j=1}^{2^{n-k}} \prod_{l=0}^{n-1} \sum_{i=0}^{1} (-1)^{i(c'_{jl} - t\delta_{ml})} \Pr(r_l \mid i) > \sum_{t=0}^{1} (-1)^{t} \sum_{j=1}^{2^{n-k}} \prod_{l=0}^{n-1} \sum_{i=0}^{1} (-1)^{i(c'_{jl} - t\delta_{ml})} \Pr(r_l \mid i),$$

or

$$\sum_{j=1}^{2^{n-k}} \prod_{l=0}^{n-1} \left[ \Pr(r_l \mid 0) + (-1)^{c'_{jl} - \delta_{ml}} \Pr(r_l \mid 1) \right] > 0. \qquad (11)$$

Dividing both sides of (11) by $\prod_{l=0}^{n-1} \Pr(r_l \mid 0)$ and using the
definition of the likelihood ratio, we have

$$\sum_{j=1}^{2^{n-k}} \prod_{l=0}^{n-1} \left[ 1 + \phi_l (-1)^{c'_{jl} - \delta_{ml}} \right] > 0. \qquad (12)$$

Then dividing both sides of (12) by the positive quantity
$\prod_{l=0}^{n-1} (1 + \phi_l)$ gives

$$\sum_{j=1}^{2^{n-k}} \prod_{l=0}^{n-1} \frac{1 + \phi_l (-1)^{c'_{jl} - \delta_{ml}}}{1 + \phi_l} > 0.$$

Finally, using the identity

$$\frac{1 + \phi_l (-1)^{c'_{jl} - \delta_{ml}}}{1 + \phi_l} = \left( \frac{1 - \phi_l}{1 + \phi_l} \right)^{c'_{jl} \oplus \delta_{ml}},$$

where '$\oplus$' denotes modulo 2 addition, we obtain the

BINARY DECODING RULE:

Set $\hat{c}_m = 0$ if

$$\sum_{j=1}^{2^{n-k}} \prod_{l=0}^{n-1} \left( \frac{1 - \phi_l}{1 + \phi_l} \right)^{c'_{jl} \oplus \delta_{ml}} > 0 \qquad (13)$$

and $\hat{c}_m = 1$ otherwise.
m 


We remark that up to this point we have ignored the
question of how one retrieves the decoded information symbols
from the code word estimate $\hat{\mathbf{c}}$. This could be a problem because,
when a symbol-by-symbol decoding rule is used, $\hat{\mathbf{c}}$ is not in
general a code word. In the case of block codes, we could insist
that the code be systematic without loss of generality,
but there might be some objection to this restriction in the
case of convolutional codes. As it turns out, this is not a
problem since the decoding rule is easily modified to produce
estimates of the information symbols directly if need be.
Simply note that every information symbol $a_m$ can be expressed as
a linear combination, over GF(p), of code word symbols $c_l$,
i.e. $a_m = \sum_{l} b_{ml} c_l$, $b_{ml} \in GF(p)$, and that the proof of the theorem
goes through intact if we substitute $\sum_{l} b_{ml} c_l$ for $c_m$ and $b_{ml}$ for
$\delta_{ml}$ in (1).


EXAMPLES:

(a) (7,4) Hamming code

We will illustrate the decoding rule for the received
symbol $r_0$. Since the (7,4) code is cyclic, $r_1, \ldots, r_6$ may be
decoded simply by cyclically permuting the received word $\mathbf{r}$ in
the buffer store.

The Binary Decoding Rule (13) in this case becomes

$$\hat{c}_0 = 0 \ \text{iff} \ \sum_{j=1}^{8} \prod_{l=0}^{6} \left( \frac{1 - \phi_l}{1 + \phi_l} \right)^{c'_{jl} \oplus \delta_{0l}} > 0. \qquad (14)$$

The parity check matrix H of the (7,4) code and its row space
C' are shown below.

         1 1 1 0 1 0 0    (a)
    H =  0 1 1 1 0 1 0    (b)
         0 0 1 1 1 0 1    (c)

         c0 c1 c2 c3 c4 c5 c6
          0  0  0  0  0  0  0
          1  1  1  0  1  0  0    (a)
          0  1  1  1  0  1  0    (b)
    C':   1  0  0  1  1  1  0    (a⊕b)          (15)
          0  0  1  1  1  0  1    (c)
          1  1  0  1  0  0  1    (a⊕c)
          0  1  0  0  1  1  1    (b⊕c)
          1  0  1  0  0  1  1    (a⊕b⊕c)

Let $\rho_l = (1 - \phi_l)/(1 + \phi_l)$. Then substituting (15) into (14) gives

$$\hat{c}_0 = 0 \ \text{iff} \ \rho_0 + \rho_1\rho_2\rho_4 + \rho_2\rho_5\rho_6 + \rho_1\rho_3\rho_6 + \rho_3\rho_4\rho_5 + \rho_0\rho_1\rho_2\rho_3\rho_5 + \rho_0\rho_2\rho_3\rho_4\rho_6 + \rho_0\rho_1\rho_4\rho_5\rho_6 > 0. \qquad (16)$$

The decoder configuration corresponding to (16) is shown in
Figure 1.
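The construction of (16) from the parity check matrix can be sketched in a few lines of code. This is an illustrative sketch only: the function names `row_space`, `parity_product_terms`, `decode_bit` and `brute_force_bit` are hypothetical, and the brute-force comparison over the 16 code words is included purely as a sanity check on the dual-domain rule.

```python
from itertools import product

# Parity check matrix of the (7,4) Hamming code, rows (a), (b), (c).
H = [
    (1, 1, 1, 0, 1, 0, 0),
    (0, 1, 1, 1, 0, 1, 0),
    (0, 0, 1, 1, 1, 0, 1),
]
N = 7


def row_space(rows):
    """All GF(2) linear combinations of the rows (the dual code C')."""
    words = []
    for coeffs in product((0, 1), repeat=len(rows)):
        words.append(tuple(
            sum(a * row[l] for a, row in zip(coeffs, rows)) % 2
            for l in range(len(rows[0]))
        ))
    return words


DUAL = row_space(H)


def parity_product_terms(m):
    """Index sets of the rho-products in rule (13) when decoding position m.

    Each dual word contributes the positions where c'_jl XOR delta_ml = 1.
    """
    return {
        tuple(l for l in range(N) if cj[l] ^ (1 if l == m else 0))
        for cj in DUAL
    }


def decode_bit(phi, m):
    """Binary Decoding Rule (13); phi[l] = Pr(r_l | 1) / Pr(r_l | 0)."""
    rho = [(1 - x) / (1 + x) for x in phi]
    total = 0.0
    for index_set in parity_product_terms(m):
        term = 1.0
        for l in index_set:
            term *= rho[l]
        total += term
    return 0 if total > 0 else 1


def brute_force_bit(phi, m):
    """Compare the two likelihood sums taken directly over the code words."""
    sums = [0.0, 0.0]
    for c in product((0, 1), repeat=N):
        if all(sum(h[l] * c[l] for l in range(N)) % 2 == 0 for h in H):
            weight = 1.0
            for l in range(N):
                weight *= phi[l] if c[l] else 1.0
            sums[c[m]] += weight
    return 0 if sums[0] > sums[1] else 1
```

Calling `parity_product_terms(0)` returns the eight index sets that appear as products in (16); other values of `m` give the corresponding rule for the other positions without cyclically shifting the received word.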

The reader will probably recognize the similarity between
the decoder of Figure 1 and a one-step majority decoder using
non-orthogonal parity checks (31). And in fact if the "soft
decision" function $(1 - \phi(x))/(1 + \phi(x))$ were replaced by the "hard
decision" function f(x) = -1 if x > 1/2 and +1 otherwise, and if
the last three parity checks in the decoder were deleted, then
the resulting circuit would be mathematically equivalent to a
conventional one-step majority decoder. Parity checks in the
circuit of Figure 1 would be computed by taking products of +1's
and -1's, rather than by taking modulo 2 sums of 0's and 1's
as would be the case in a conventional digital decoding circuit.
(b) (4,3,3) convolutional code

We now illustrate the decoding rule for the received symbol
$r_0$ using an $(n_0, k_0, m) = (4,3,3)$ convolutional code (from
Peterson and Weldon (42), page 395).

The Binary Decoding Rule (13) in this case becomes

$$\hat{c}_0 = 0 \ \text{iff} \ \sum_{j=1}^{\infty} \prod_{l=0}^{\infty} \left( \frac{1 - \phi_l}{1 + \phi_l} \right)^{c'_{jl} \oplus \delta_{0l}} > 0. \qquad (17)$$

Of course, there are only a finite number of nonzero terms in
(17), the number depending upon the length of the transmitted
code sequence. The initial portions of the parity check matrix
H of the (4,3,3) code and its row space C' are shown below.

         1111 0...                (a)
    H =  1010 1111 0...           (b)
         1100 1010 1111 0...      (c)

         0...
         1111 0...                (a)
         1010 1111 0...           (b)
    C':  0101 1111 0...           (a⊕b)          (18)
         1100 1010 1111 0...      (c)
         0011 1010 1111 0...      (a⊕c)
         0110 0101 1111 0...      (b⊕c)
         1001 0101 1111 0...      (a⊕b⊕c)

As before, let $\rho_l = (1 - \phi_l)/(1 + \phi_l)$. Then substituting (18) into
(17) gives

$$\hat{c}_0 = 0 \ \text{iff} \ \rho_0 + \rho_1\rho_2\rho_3 + \rho_2\rho_4\rho_5\rho_6\rho_7 + \rho_0\rho_1\rho_3\rho_4\rho_5\rho_6\rho_7 + \cdots > 0. \qquad (19)$$

The decoding diagram corresponding to (19) is shown in Figure 2.
This takes the form of a trellis diagram for the (4,1,3) dual
code C' with the $c'_{j0}$ positions in the branch labels complemented.
(In general, to decode $r_m$ the $c'_{jm}$ positions would be
complemented.) Note that the all-zero state acts as the accumulator
for the terms of (19).
Figure 2. Decoder for the (4,3,3) code.

Since a different storage unit must be used for each
symbol to be decoded, the amount of storage for this type of
decoder grows linearly with the length of the transmitted code
sequence. This is also true of a Viterbi decoder, which must
keep track of its path-elimination decisions.

We now restrict our attention to linear binary block codes
with equiprobable code words transmitted over the additive white
Gaussian noise channel by antipodal signalling and present
asymptotic expressions for the probability of bit error, $P_{BIT}$,
for both high and low SNR.

3.3 Asymptotic Results for the WGNC

The optimum bit-by-bit decoding rule for the mth digit of
an (n,k) linear binary block code C is: set $\hat{c}_m = s$, where
$s \in GF(2)$ maximizes $P(c_m = s \mid \mathbf{r})$. Here $\mathbf{c} = (c_0, \ldots, c_{n-1})$ is the
transmitted code word, $\mathbf{r} = (r_0, \ldots, r_{n-1})$ is the received word,
and $\hat{c}_m$ is the decoder's estimate of the transmitted code digit
$c_m$. The probability of bit error is then given by

PBIT  ’ P|P(om  * cmlE>  i p(=m  * SjE’1  • 

The derivation of our results is simplified by assuming that the
all-0 code word is transmitted. It is easily seen that the
assumption is valid for the case under consideration because of
the group property of the code, which renders the view of n-space
from one code word the same as from another, and because
of the symmetry of the noise, from which it follows that noise
vector $\mathbf{e} = (e_0, \ldots, e_{n-1})$ occurs when $\mathbf{c} = (c_0, \ldots, c_{n-1})$ is
transmitted with the same probability that
$\mathbf{e}' = ((-1)^{c_0} e_0, \ldots, (-1)^{c_{n-1}} e_{n-1})$
occurs when the all-0 code word is transmitted.

When the all-0 code word is transmitted, the mth component
of the received word is

$$r_m = \sqrt{E} + e_m,$$

where E is the signal energy per channel bit and $e_m$ is a noise
sample of a Gaussian process with single-sided noise power per
hertz $N_0$. The variance of $e_m$ is $N_0/2$ and the SNR for this
channel is $\gamma = E/N_0$. In order to account for the redundancy in
codes of different rates, we will use the SNR per transmitted
bit of information, $\gamma_b = E_b/N_0 = \gamma n/k = \gamma/R$, in our derivations.
In terms of $\gamma_b$, we can write the likelihood ratio as

$$\phi(r_m) = \exp\left( -\frac{(r_m + \sqrt{E})^2}{N_0} \right) \Big/ \exp\left( -\frac{(r_m - \sqrt{E})^2}{N_0} \right) = \exp\left( -4 r_m \sqrt{\gamma / N_0} \right) = \exp\left( -4 r_m \sqrt{R \gamma_b / N_0} \right). \qquad (1)$$

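The mth component of the received word $\mathbf{r}$ will be decoded
incorrectly if and only if

$$P(c_m = 0 \mid \mathbf{r}) \le P(c_m = 1 \mid \mathbf{r}),$$

where $\mathbf{r} = (\sqrt{E} + e_0, \ldots, \sqrt{E} + e_{n-1})$. In other words,

$$P_{BIT} = P\left[ \sum_{\mathbf{c} \in S_0} P(\mathbf{c} \mid \mathbf{r}) \le \sum_{\mathbf{c} \in S_1} P(\mathbf{c} \mid \mathbf{r}) \right], \qquad (2)$$

where $S_i = \{\mathbf{c} \in C \mid c_m = i\}$, i = 0,1. Since the channel is memoryless,
and the code words are equiprobable (so that we may invoke
Bayes' formula), (2) may be written as

$$P_{BIT} = P\left[ \sum_{\mathbf{c} \in S_0} \prod_{l=0}^{n-1} P(r_l \mid c_l) \le \sum_{\mathbf{c} \in S_1} \prod_{l=0}^{n-1} P(r_l \mid c_l) \right]. \qquad (3)$$

Since $\prod_{l=0}^{n-1} P(r_l \mid 0) > 0$, we can rewrite (3) in terms of the
likelihood ratio as

$$P_{BIT} = P\left[ \sum_{\mathbf{c} \in S_0} \prod_{l=0}^{n-1} \phi(r_l)^{c_l} \le \sum_{\mathbf{c} \in S_1} \prod_{l=0}^{n-1} \phi(r_l)^{c_l} \right]. \qquad (4)$$

Substituting (1) into (4) yields

$$P_{BIT} = P\left[ \sum_{\mathbf{c} \in S_0} \exp\left( -\mathbf{r} \cdot \mathbf{c} \, 4\sqrt{R\gamma_b / N_0} \right) \le \sum_{\mathbf{c} \in S_1} \exp\left( -\mathbf{r} \cdot \mathbf{c} \, 4\sqrt{R\gamma_b / N_0} \right) \right]. \qquad (5)$$

We now fix $N_0$ and vary $E_b$ to obtain the asymptotic behavior of
$P_{BIT}$ as $\gamma_b$ decreases (denoted by $\mathrm{A.B.}_{\gamma_b \to 0}(P_{BIT})$) and as $\gamma_b$ increases
(denoted by $\mathrm{A.B.}_{\gamma_b \to \infty}(P_{BIT})$).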

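Low SNR Case

For x small, $\exp x \approx 1 + x$, so for $\gamma_b$ in a small
neighborhood of zero we may write (5) as

$$\mathrm{A.B.}_{\gamma_b \to 0}(P_{BIT}) \approx P\left[ \sum_{\mathbf{c} \in S_0} \left( 1 - \mathbf{r} \cdot \mathbf{c} \, 4\sqrt{R\gamma_b / N_0} \right) \le \sum_{\mathbf{c} \in S_1} \left( 1 - \mathbf{r} \cdot \mathbf{c} \, 4\sqrt{R\gamma_b / N_0} \right) \right]. \qquad (6)$$

By [42, problem 3.5], (6) can then be written as

$$\mathrm{A.B.}_{\gamma_b \to 0}(P_{BIT}) \approx P\left[ \sum_{\mathbf{c} \in S_0} -\mathbf{r} \cdot \mathbf{c} \, 4\sqrt{R\gamma_b / N_0} \le \sum_{\mathbf{c} \in S_1} -\mathbf{r} \cdot \mathbf{c} \, 4\sqrt{R\gamma_b / N_0} \right],$$

which implies that

$$\mathrm{A.B.}_{\gamma_b \to 0}(P_{BIT}) = P\left[ \sum_{\mathbf{c} \in S_0} \mathbf{r} \cdot \mathbf{c} \ge \sum_{\mathbf{c} \in S_1} \mathbf{r} \cdot \mathbf{c} \right]. \qquad (7)$$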
$S_0$ is a linear binary (n,k-1) code, so if the vectors of $S_0$ are
arranged as rows of a matrix $M_0$, then each column will contain
all 0's or else $2^{k-2}$ 0's and $2^{k-2}$ 1's. Then if we arrange, as
rows of a matrix $M_1$, the vectors of set $S_1$, the columns of $M_1$
which correspond to all-0 columns of $M_0$ must contain all 1's,
and all other columns of $M_1$ will contain $2^{k-2}$ 0's and $2^{k-2}$ 1's.
Using this fact, we can write (7) as

$$\mathrm{A.B.}_{\gamma_b \to 0}(P_{BIT}) = P\left[ \sum_{i=1}^{\theta} r_{j_i} \le 0 \right],$$

where $j_1, \ldots, j_\theta$ are the columns of $M_0$ which contain all 0's.
Since $r_l$ is a normally distributed random variable with mean
$\sqrt{E}$ and variance $N_0/2$, and since the noise is white,

$$P\left[ \sum_{i=1}^{\theta} r_{j_i} \le 0 \right] = \frac{1}{\sqrt{2\pi}} \int_{\sqrt{2\theta R \gamma_b}}^{\infty} \exp(-x^2/2) \, dx = Q\left( \sqrt{2\theta R \gamma_b} \right).$$

The desired asymptotic expression for low SNR is thus

$$\mathrm{A.B.}_{\gamma_b \to 0}(P_{BIT}) \approx Q\left( \sqrt{2\theta R \gamma_b} \right).$$

We note that if C is the binary (n,1) code, then R = 1/n,
$\theta = n$ and

$$\mathrm{A.B.}_{\gamma_b \to 0}(P_{BIT}) \approx Q\left( \sqrt{2\gamma_b} \right),$$

which is the probability of bit error when no coding is used.
Also, we note that if the dual code of C has minimum Hamming
distance greater than 2, then $\theta = 1$ and

$$\mathrm{A.B.}_{\gamma_b \to 0}(P_{BIT}) \approx Q\left( \sqrt{2R\gamma_b} \right),$$

which is the probability of bit error before decoding.
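As a small illustration of the low SNR expression, the quantity $\theta$ can be found by brute force for short codes. The sketch below is hypothetical in its details: the helper names (`q_function`, `code_words`, `theta`, `low_snr_bit_error`) and the choice of the (7,4) Hamming code and a (5,1) repetition code as test cases are assumptions made for illustration.

```python
import math
from itertools import product


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def code_words(parity_checks, n):
    """All binary n-tuples satisfying every parity check (brute force)."""
    return [
        c for c in product((0, 1), repeat=n)
        if all(sum(h[l] * c[l] for l in range(n)) % 2 == 0 for h in parity_checks)
    ]


def theta(code, m):
    """Number of all-0 columns of M_0, the matrix of code words with c_m = 0."""
    s0 = [c for c in code if c[m] == 0]
    return sum(1 for l in range(len(code[0])) if all(c[l] == 0 for c in s0))


def low_snr_bit_error(code, m, k, gamma_b):
    """Evaluate Q(sqrt(2 * theta * R * gamma_b)) with R = k / n."""
    n = len(code[0])
    return q_function(math.sqrt(2.0 * theta(code, m) * (k / n) * gamma_b))


HAMMING_H = [(1, 1, 1, 0, 1, 0, 0), (0, 1, 1, 1, 0, 1, 0), (0, 0, 1, 1, 1, 0, 1)]
HAMMING = code_words(HAMMING_H, 7)   # dual has minimum distance 4 > 2
REPETITION = [(0,) * 5, (1,) * 5]    # the binary (5,1) code
```

For the repetition code `theta` returns n, so the expression reduces to the uncoded value $Q(\sqrt{2\gamma_b})$; for the Hamming code it returns 1, matching the two special cases noted in the text.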

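High SNR case

Since the all-0 code word is a member of $S_0$,

$$\sum_{\mathbf{c} \in S_0} \exp\left( -\mathbf{r} \cdot \mathbf{c} \, 4\sqrt{R\gamma_b / N_0} \right) \ge 1$$

in (5). In the case $\mathbf{r} \cdot \mathbf{c} > 0$ for all $\mathbf{c} \in S_1$,

$$\sum_{\mathbf{c} \in S_1} \exp\left( -\mathbf{r} \cdot \mathbf{c} \, 4\sqrt{R\gamma_b / N_0} \right) \approx 0$$

for large values of $\gamma_b$. Thus $\mathrm{A.B.}_{\gamma_b \to \infty}(P_{BIT})$ is upper bounded by

$$\mathrm{A.B.}_{\gamma_b \to \infty}(P_{BIT}) \le P\left[ \bigcup_{\mathbf{c} \in S_1} \mathbf{r} \cdot \mathbf{c} \le 0 \right] \le \sum_{\mathbf{c} \in S_1} P(\mathbf{r} \cdot \mathbf{c} \le 0).$$

We now show that this upper bound is tight for sufficiently
large values of $\gamma_b$, i.e.

$$\mathrm{A.B.}_{\gamma_b \to \infty}(P_{BIT}) \approx \sum_{\mathbf{c} \in S_1} P(\mathbf{r} \cdot \mathbf{c} \le 0).$$

Let $\mathbf{c}_1$ and $\mathbf{c}_2$ be two code words of C. First note that
$\mathbf{r} \cdot \mathbf{c}_1 = \mathbf{e} \cdot \mathbf{c}_1 + \sqrt{E}\, w(\mathbf{c}_1)$ and $\mathbf{r} \cdot \mathbf{c}_2 = \mathbf{e} \cdot \mathbf{c}_2 + \sqrt{E}\, w(\mathbf{c}_2)$, where $w(\mathbf{c})$ is
the Hamming weight of $\mathbf{c}$. Without loss of generality, we assume
that $w(\mathbf{c}_1) \le w(\mathbf{c}_2)$. Let $\mathbf{e}_1$ be the solution to the problem of
minimizing $\mathbf{e} \cdot \mathbf{e}$ subject to the constraints $\mathbf{e} \cdot \mathbf{c}_1 + \sqrt{E}\, w(\mathbf{c}_1) \le 0$
and $\mathbf{e} \cdot \mathbf{c}_2 + \sqrt{E}\, w(\mathbf{c}_2) \le 0$, and let $\mathbf{e}_2$ be the solution to the
problem of minimizing $\mathbf{e} \cdot \mathbf{e}$ subject to $\mathbf{e} \cdot \mathbf{c}_1 + \sqrt{E}\, w(\mathbf{c}_1) \le 0$. It is
easy to show that $\mathbf{e}_1 \cdot \mathbf{e}_1 - \mathbf{e}_2 \cdot \mathbf{e}_2 = Ez$, z > 0, from which it
follows that for sufficiently large values of E (and thus $\gamma_b$),

$$P(\mathbf{r} \cdot \mathbf{c}_1 \le 0) \gg P(\mathbf{r} \cdot \mathbf{c}_1 \le 0,\ \mathbf{r} \cdot \mathbf{c}_2 \le 0). \qquad (8)$$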
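But then

$$P\left[ \bigcup_{\mathbf{c} \in S_1} \mathbf{r} \cdot \mathbf{c} \le 0 \right] \approx \sum_{\mathbf{c} \in S_1} P(\mathbf{r} \cdot \mathbf{c} \le 0),$$

since, with high probability, only one of the inner products
$\mathbf{r} \cdot \mathbf{c}$ will make a significant contribution. Again by (8), we may
conclude that for sufficiently large values of $\gamma_b$,

$$\mathrm{A.B.}_{\gamma_b \to \infty}(P_{BIT}) \approx P\left[ \bigcup_{\mathbf{c} \in S_1} \mathbf{r} \cdot \mathbf{c} \le 0 \right].$$

Thus

$$\mathrm{A.B.}_{\gamma_b \to \infty}(P_{BIT}) \approx \sum_{\mathbf{c} \in S_1} P(\mathbf{r} \cdot \mathbf{c} \le 0).$$

Now we know that

$$P(\mathbf{r} \cdot \mathbf{c} \le 0) = \frac{1}{\sqrt{2\pi}} \int_{\sqrt{2R\, w(\mathbf{c})\, \gamma_b}}^{\infty} \exp(-x^2/2) \, dx = Q\left( \sqrt{2R\, w(\mathbf{c})\, \gamma_b} \right),$$

from which it follows that

$$\mathrm{A.B.}_{\gamma_b \to \infty}(P_{BIT}) \approx \sum_{\mathbf{c} \in S_1} Q\left( \sqrt{2R\, w(\mathbf{c})\, \gamma_b} \right).$$

Let $w_m$ be the minimum weight of code words in $S_1$ and $N(w_m)$ the
number of code words of weight $w_m$. Since only these code words
make a significant contribution to the sum, the desired asymptotic
expression for high SNR is seen to be

$$\mathrm{A.B.}_{\gamma_b \to \infty}(P_{BIT}) \approx N(w_m) \, Q\left( \sqrt{2R\, w_m \gamma_b} \right).$$

If the code is cyclic with minimum distance d, then $w_m = d$
and

$$\mathrm{A.B.}_{\gamma_b \to \infty}(P_{BIT}) \approx N(d) \, Q\left( \sqrt{2R\, d\, \gamma_b} \right).$$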
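The high SNR expression needs only the minimum weight $w_m$ among the code words with a 1 in position m and the number of such words. A minimal sketch, assuming the (7,4) Hamming code of the earlier example as the test case and counting $N(w_m)$ within $S_1$ (the helper names are hypothetical):

```python
import math
from itertools import product


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def code_words(parity_checks, n):
    """All binary n-tuples satisfying every parity check (brute force)."""
    return [
        c for c in product((0, 1), repeat=n)
        if all(sum(h[l] * c[l] for l in range(n)) % 2 == 0 for h in parity_checks)
    ]


def min_weight_in_s1(code, m):
    """Return (w_m, N(w_m)) for S_1 = {c in C : c_m = 1}."""
    weights = [sum(c) for c in code if c[m] == 1]
    w_m = min(weights)
    return w_m, weights.count(w_m)


def high_snr_bit_error(code, m, k, gamma_b):
    """Evaluate N(w_m) * Q(sqrt(2 * R * w_m * gamma_b)) with R = k / n."""
    n = len(code[0])
    w_m, count = min_weight_in_s1(code, m)
    return count * q_function(math.sqrt(2.0 * (k / n) * w_m * gamma_b))


HAMMING_H = [(1, 1, 1, 0, 1, 0, 0), (0, 1, 1, 1, 0, 1, 0), (0, 0, 1, 1, 1, 0, 1)]
HAMMING = code_words(HAMMING_H, 7)
```

Brute-force enumeration is only practical for short codes; for longer codes the weight distribution would have to come from tables or from the structure of the code.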
Section 4

PARITY CHECK SET CONSTRUCTIONS

It is known (39) that any digital decoding function for a
linear binary code can be realized as a weighted majority of
nonorthogonal parity checks. An open question of practical
interest is: For an (n,k) linear block code, how do we find a
small subset of the $2^{n-k}$ available parity checks that is capable
of correctly decoding the first received bit in spite of any
pattern of t or fewer errors? In this note we present two
approaches to constructing such subsets. The first approach,
which applies to cyclic codes only, is based on "squaring", an
automorphism of any binary cyclic code. The second approach,
which is applicable to any linear binary code, is based on a
"measure of reliability".

THE SQUARING APPROACH:

Let $R_n$ be the ring of polynomials modulo $x^n - 1$ over GF(2).
A cyclic code C of block length n is an ideal of $R_n$. The
generator of C, g(x), is a divisor of $x^n - 1$.

Let $\Pi_2$ denote the permutation $k \to 2k$ (mod n) of
{0,1,...,n-1}. $\Pi_2$ induces the "squaring" automorphism of $R_n$

$$\Pi_2\left( \sum_{i=0}^{n-1} a_i x^{i} \right) = \sum_{i=0}^{n-1} a_i x^{2i}, \qquad a_i \in GF(2).$$

Since the square of a multiple of g(x) is also a multiple of
g(x), a binary cyclic code is invariant under the operation of
squaring, and moreover the square of a code word of Hamming
weight w is another code word of weight w. The set of all code
words obtained by squaring a code word v is called the "square
set of v".


Let $E_1, \ldots, E_l$ denote the cycles of $\Pi_2$ and let $\bar{v}$ denote the
set of position indices where the code word $v \in C$ has ones.
Define a function $f(v) = (n_1, \ldots, n_l)$, where $n_j = |E_j \cap \bar{v}|$. It
is easily shown (43) that $f(v) = f(v^2)$ for any $v \in C$. A straightforward
application of this property yields the following theorem
on $M_v$, the incidence matrix of the square set of v:

Theorem:

Let

$$M_v = \begin{bmatrix} v \\ v^{2} \\ \vdots \\ v^{2^{(m-1)}} \end{bmatrix}$$

be the incidence matrix of the square set of v and suppose
$f(v) = (n_1, \ldots, n_l)$, where $n_j = |E_j \cap \bar{v}|$. Then the columns of $M_v$
corresponding to the components of $E_j$ each contain exactly
$n_j m / |E_j|$ 1's, where m is the multiplicative order of 2 (mod n).

Example 1

Suppose n = 7. Then $E_1$ = (0), $E_2$ = (1,2,4), $E_3$ = (3,6,5).
Let C be the (7,6) code and v = {0,1}. Then $v^2$ = {0,2}, $v^4$ = {0,4},
$v^8 = v$ and $f(v) = f(v^2) = f(v^4) = (1,1,0)$. The incidence matrix
of the square set of v is

             0 1 2 3 4 5 6
             1 1 0 0 0 0 0
    M_v  =   1 0 1 0 0 0 0
             1 0 0 0 1 0 0

Now let C be the dual of the code we wish to decode and let
v and v' be code words of C with '1' in the first position.
Assume $f(v) = (1, n_2, \ldots, n_l)$ and $f(v') = (1, n'_2, \ldots, n'_l)$ and that
$f(v) \ne f(v')$. If it is possible to find integers a > 0 and
b > 0 such that $a n_j + b n'_j \le \lambda$ for j = 2,...,l, then the parity
checks of the square set of v, replicated a times, can be
combined with the parity checks of the square set of v',
replicated b times, to obtain a nonorthogonal parity check set in
which the first error position $e_0$ is checked by all of the parity
checks, but no other position is checked by more than $\lambda$ parity
checks.


Example 2 (17,9) code

The (17,9) double-error-correcting quadratic residue code
is not L-step orthogonalizable. However, this code can be
weighted-majority decoded in one step using 15 nonorthogonal
parity checks as we now show.

For n = 17, $E_1$ = {0}, $E_2$ = {1,2,4,8,16,15,13,9} and
$E_3$ = {3,6,12,7,14,11,5,10}. Two code words of weight 6 in the
(17,8) dual code for which $f(v) \ne f(v')$ are:

    v  = {0,1,3,6,8,9}       f(v)  = (1,3,2)
    v' = {0,4,5,6,7,11}      f(v') = (1,1,4).
If 'a' is the weight (number of replications) assigned to the
square set of v and 'b' is the weight assigned to the square set
of v', then the following equations must be solved for a, b, and
$\lambda$:

    3a + b  = λ
    2a + 4b = λ

The simplest solution is a = 3, b = 1, $\lambda$ = 10. There are 8
words in each square set and each word checks the first position.
Applying Ng's bound (44) with r = 3(8) + 1(8) = 32 and $\lambda$ = 10, we
see that it is possible to decode the first received bit in
spite of any pattern of $t = \lfloor (r + \lambda - 1)/(2\lambda) \rfloor = 2$ or fewer errors. This
nonorthogonal parity check set and its associated weights are
shown in Table 1. Note that since the minimum weight of any
syndrome of an error pattern with $e_0 = 1$ is 2 greater than the
maximum weight of any syndrome of an error pattern with $e_0 = 0$,
the last check may be discarded, thereby reducing the number of
nonzero parity checks required to 15.
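The bookkeeping in Example 2 is easy to reproduce mechanically. The sketch below is illustrative only, with hypothetical function names; it works with the supports (sets of position indices) of code words, computes the cycles of $k \to 2k$ (mod n), evaluates f, builds square sets by repeated doubling of the indices, and counts how many weighted checks cover each position.

```python
def cycles_of_doubling(n):
    """Cycles of the permutation k -> 2k (mod n), for odd n."""
    seen, cycles = set(), []
    for start in range(n):
        if start not in seen:
            cycle, k = [], start
            while k not in seen:
                seen.add(k)
                cycle.append(k)
                k = (2 * k) % n
            cycles.append(cycle)
    return cycles


def f(support, n):
    """f(v) = (n_1, ..., n_l), with n_j the number of ones of v in cycle E_j."""
    return tuple(len(set(c) & set(support)) for c in cycles_of_doubling(n))


def square_set(support, n):
    """Supports of v, v^2, v^4, ... until the sequence repeats."""
    words, current = [], frozenset(support)
    while current not in words:
        words.append(current)
        current = frozenset((2 * k) % n for k in current)
    return words


def check_counts(weighted_sets, n):
    """Weighted number of parity checks covering each position."""
    counts = [0] * n
    for weight, words in weighted_sets:
        for word in words:
            for k in word:
                counts[k] += weight
    return counts


N = 17
V = {0, 1, 3, 6, 8, 9}
V_PRIME = {0, 4, 5, 6, 7, 11}
COUNTS = check_counts([(3, square_set(V, N)), (1, square_set(V_PRIME, N))], N)
```

With a = 3 and b = 1 the first position is covered 32 times and every other position 10 times, which is the (r, λ) pair used in the bound above.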

THE MEASURE OF RELIABILITY APPROACH:

Another approach to selecting a parity check set is to
define a measure of the "reliability" of a parity check and
then use this measure to select a subset of the $2^{n-k}$ checks
available. One such measure of reliability is the absolute value
of the number of times a parity check is "right" minus the number
of times it is "wrong" over the set of error patterns of interest,
where we say that a parity check is "right" for an error pattern
$\mathbf{e} = (e_0, \ldots, e_{n-1})$ if the check sum is equal to $e_0$; otherwise it
is "wrong". The error patterns of interest in the examples
to follow are those of weight t or less, where t is the
guaranteed error-correction capability of the code. The general
idea is to use the parity checks with the highest reliability
coefficients. When a parity check is "wrong" more often than it
is "right", we usually use the complement of the check (denoted
by a minus sign assigned to the weight).

    Word in the (17,8) Dual Code    Assigned Weight

    00000000000000000               10
    11010010110000000                3
    11100010000010001                3
    10101001000010010                3
    10001001100001100                3
    10000000110100101                3
    11000100000100011                3
    10100100001001010                3
    10011000011001000                3
    10001111000100000                1
    10000100101010100                1
    10010001001100001                1
    10010110000000110                1
    10000010001111000                1
    10010101010010000                1
    11000011001000100                1
    10110000000110100                1

Table 1. Parity check set used to decode the (17,9) code

Example 3 (15,5) code

The (15,5) triple-error-correcting cyclic Reed-Muller code
can be majority decoded in two steps using 42 orthogonal parity
checks (22,29). We now show that this code can be weighted
majority decoded in one step using 31 nonorthogonal parity checks.

The (15,10) dual code has the following weight distribution:

    Code word weight         0     4     6     8    10    12
    Number of code words     1   105   280   435   168    35

The error patterns of interest in this case are the patterns of
3 or fewer errors, and the reliability coefficients of the parity
checks with respect to this set of error patterns are as given
in Table 2. The zero parity check is the most reliable, the
weight-12 parity checks which do not check $e_0$ are the second
most reliable, and the weight-4 parity checks which check $e_0$
are the third most reliable. (We use the complements of the
second set of check sums since they are "wrong" more often than
they are "right".) The nonorthogonal parity check set formed
from these three sets of checks and the associated weights are
shown in Table 3. (A negative weight indicates that the
complement of the check sum is to be used.) It is possible to
discard the first four nonzero checks, thereby reducing the
required number of nonzero parity checks to 31.
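The reliability measure defined above can be evaluated by direct enumeration when n and t are small. The following sketch is one reading of that definition, not the authors' program: the function name `signed_reliability` is hypothetical, and it returns the signed difference (right minus wrong) so that a negative value signals that the complement of the check would be used, while the reliability coefficient itself is the absolute value.

```python
from itertools import combinations


def signed_reliability(check, n, t):
    """(# times right) - (# times wrong) over error patterns of weight <= t.

    check : 0/1 string of length n giving the positions in the check sum.
    A check is "right" for an error pattern e when its check sum over e
    equals e_0, the error bit in the first position.
    """
    support = {i for i, bit in enumerate(check) if bit == "1"}
    score = 0
    for weight in range(t + 1):
        for errors in combinations(range(n), weight):
            check_sum = len(support.intersection(errors)) % 2
            e0 = 1 if 0 in errors else 0
            score += 1 if check_sum == e0 else -1
    return score


# Usage for the (15,10) dual code with t = 3: one word of each kind
# discussed in the text (zero check, weight-4 check covering e_0,
# weight-12 check not covering e_0).
ZERO_CHECK = signed_reliability("0" * 15, 15, 3)
WEIGHT_4_CHECK = signed_reliability("111000000010000", 15, 3)
WEIGHT_12_CHECK = signed_reliability("001101111111111", 15, 3)
```

The enumeration visits 576 error patterns for n = 15 and t = 3, so it is instantaneous here; for longer codes a closed-form count over the intersection sizes would be preferable.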
    Code Word Weight    No. of Words of That Weight    Reliability Coefficient


    Words in the (15,10) Dual Code    Assigned Weight

    000000000000000                    6
    111000000010000                    1
    110100010000000                    1
    110001001000000                    1
    110000100000100                    1
    110000000100001                    1
    110000000001010                    1
    101100000000010                    1
    101011000000000                    1
    101000100000001                    1
    101000010001000                    1
    101000000100100                    1
    100110000100000                    1
    100101000000100                    1
    100100101000000                    1
    100100000011000                    1
    100010100001000                    1
    100010010000001                    1
    100010001010000                    1
    100010000000110                    1
    100001110000000                    1
    100001000100010                    1
    100001000001001                    1
    100000100110000                    1
    100000011000100                    1
    100000010010010                    1
    100000001101000                    1
    100000001000011                    1
    100000000010101                    1
    001101111111111                   -1
    010111110111111                   -1
    011011111111110                   -1
    011110111101111                   -1
    011111011111101                   -1
    011111101011111                   -1
    011111111110011                   -1

Table 3. Parity check set used to decode the (15,5) code
Section 5

CONCLUSIONS

As pointed out in Section 2, we are now in a position to
translate all of the digital decoding algorithms of classical
algebraic coding theory into analog decoding algorithms via
the isomorphism between the additive S-domain and the
multiplicative S'-domain. There are some enticing experiments begging
to be conducted. For example, how well would a multiplicative
BCH decoding algorithm perform when applied to an unquantized
received word? How are we to interpret such a decoding algorithm?
The very concept of BCH codes and their decoding algorithms is
based on the assumption that a digital error vector of no more
than t nonzero components has been added to the code word in
the finite-field domain of the code. From the viewpoint of
classical algebraic coding theory, it makes no sense to talk
about error vectors with analog components. Yet we know that a
multiplicative BCH decoding algorithm will correct some set of
error vectors. The only question is: what set? (Actually, it
is probable that a raw, unmodified multiplicative BCH decoder
would decode only those digital errors that it would have
decoded in the additive domain, since a BCH decoder (unlike a
threshold decoder) may elect not to decode a received word. But
once in the multiplicative domain, a great many "loosening up"
modifications suggest themselves. One might consider
generalizing the idea of "root", for instance.)
It is also quite possible that there exists an analog
decoding algorithm (and a new class of codes, perhaps?) which is
analogous to BCH decoding, but which is designed to correct only
those error vectors which fall within a continuous sphere of
Euclidean radius t, rather than being designed to correct only
those digital error vectors which fall within a discrete "sphere"
of Hamming radius t. Of course, the ability to derive analog
decoding methods which are analogous to, rather than direct
translations of, classical digital decoding algorithms requires
an understanding of the principles of decoding in the
multiplicative domain which we do not yet possess.

In addition to new analog decoding algorithms obtained by
direct translation of existing digital algorithms, or constructed
by analogy with existing digital algorithms, there is the
possibility of devising entirely new decoding methods which have
no counterpart in classical algebraic coding theory. This
possibility stems from the new algebraic properties acquired
when we move from the classical additive domain to the
multiplicative domain. For example, in this new domain a digital
decoding function can always be extended to an analytic function.
This means that all of the techniques of real and complex analysis
become available. One thinks immediately of hill-climbing
techniques which will, with the unquantized received word as the
starting point, converge to the nearest code word. We have in
fact tried this, and our first experiment with a convergence
technique had an outcome which we should have been able to
predict.

In the course of our work on the optimum symbol-by-symbol
decoding algorithm described in the previous section, we asked
ourselves how we could modify the algorithm to do optimum word
decoding (and thus be equivalent to correlation and Viterbi
decoding). We designed an experiment for the white Gaussian
channel in which we fed back the unquantized output of the
optimum symbol-by-symbol decoder to the input (after the initial
processing of the unquantized received word) and then iterated
the process until it stabilized. To our delight, the contents
of the fed-back decoder did indeed converge to the nearest code
word. We then became interested in the speed of convergence
and found that by "fooling" the decoder* into thinking that the
SNR on the channel was higher than it actually was, the rate of
convergence was increased. In the end we found that by
providing the decoder with a sufficiently high artificial SNR, the
nearest code word was always produced on the first pass. In
other words, iteration was not necessary! In retrospect, it is
obvious that this should be the case. After all, one need not
derive the SNR for correlation or Viterbi decoding, and besides,
the optimum symbol-by-symbol decoding algorithm uses all of
the available parity checks, which should have led us to suspect
that nothing further would be gained by iteration.

* "Fooling" the decoder consists of using an artificially high or
low SNR when computing the likelihood ratio. This has the
effect of changing the shape of the (adaptive) soft-decision
decoding function.

What this result does suggest is that there should be a
way to trade off speed of convergence and hardware complexity.
If all of the parity checks are used, the hardware component of
complexity is maximized and the time component is minimized. If
too few parity checks are used, the decoder does not converge to
the nearest code word, and in fact may not converge to a code
word at all. Based on experience in other areas where
convergence techniques are widely used, we expect to find an
operating range in which the hardware and time components of
complexity can be traded off (with a further trade-off involving
error probability), and that the optimum operating point for
most applications will involve iteration. We consider this
line of investigation to be one of the most exciting in the
period ahead. We might note here that a similar idea has been
suggested by Chase (45), who plans to investigate a "cascade"
decoder (a soft-decision decoder followed by a hard-decision
decoder).
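
The one-pass behaviour under an artificially high SNR, described
above for the case in which all parity checks are used, can be
reproduced on a toy example. The sketch below is not the decoder
of Section 3: it computes the bitwise posterior by brute-force
enumeration of the code words rather than through the parity
checks, and it assumes a white Gaussian channel with symbols sent
as 0 -> +1, 1 -> -1. The (7,4) Hamming code, the received word,
and all names (inv_noise_var in particular, standing for the
assumed 1/sigma^2) are illustrative choices.

```python
import math
from itertools import product

# Generator matrix of a (7,4) Hamming code, used only as a small example.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def codewords(gen):
    """Enumerate all 2^k code words spanned by the rows of gen."""
    n = len(gen[0])
    return [
        tuple(sum(m * row[i] for m, row in zip(msg, gen)) % 2 for i in range(n))
        for msg in product((0, 1), repeat=len(gen))
    ]


CODE = codewords(G)


def correlation(r, c):
    """Correlate the received word r with code word c sent as 0 -> +1, 1 -> -1."""
    return sum(ri * (1 - 2 * ci) for ri, ci in zip(r, c))


def nearest_codeword(r):
    """Word decoding: pick the code word with the largest correlation."""
    return max(CODE, key=lambda c: correlation(r, c))


def symbol_by_symbol(r, inv_noise_var):
    """Bitwise maximum a posteriori decisions for equiprobable code words.

    inv_noise_var is the 1/sigma^2 the decoder assumes; setting it above
    the true value is the "fooling" described in the text.
    """
    scaled = [inv_noise_var * correlation(r, c) for c in CODE]
    top = max(scaled)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(x - top) for x in scaled]
    decisions = []
    for i in range(len(r)):
        p0 = sum(w for w, c in zip(weights, CODE) if c[i] == 0)
        p1 = sum(w for w, c in zip(weights, CODE) if c[i] == 1)
        decisions.append(0 if p0 >= p1 else 1)
    return tuple(decisions)


# All-zero code word sent; the third symbol is received with the wrong sign.
r = (0.9, 1.1, -0.2, 0.8, 1.0, 0.7, 1.2)
one_pass = symbol_by_symbol(r, inv_noise_var=10.0)
```

With the assumed SNR raised, the code word nearest to r dominates
every bit posterior, so `one_pass` coincides with
`nearest_codeword(r)` without any iteration. How the picture
changes when only a subset of the parity checks is used is the
open question raised in the text and is not addressed by this
sketch.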

In  Section  3,  we  presented  a symbol-by-symbol  decoding  rule 
for  linear  codes  which  is  optimum  in  the  sense  that  it  minimizes 
the  probability  of  symbol  error  on  a time-discrete  memoryless 
channel  when  the  code  words  are  equiprobable.  A comment  or  two 
on  the  relationship  between  this  technique  and  correlation/ 
Viterbi  decoding  would  seem  to  be  in  order. 

First, although the performance of correlation/Viterbi
decoding is inferior to the performance of the decoding rule
presented here on a symbol-error basis, and vice versa on a
word-error basis, some preliminary simulation results for the
white Gaussian channel suggest that the two approaches are very
close in performance on either basis. Symbol-error-rate is
generally considered to be a better measure of performance than
word-error-rate, especially in the case of convolutional codes,
and this would seem to give a slight edge to the decoding rule
presented here. On the other hand, correlation/Viterbi decoding
is applicable to nonlinear as well as linear codes, which might
be an advantage in some applications. Our present feeling is
that for all practical purposes the two approaches give virtually
the same performance.

When we turn to the question of complexity, however, there
is a considerable difference between the two decoding techniques.
Correlation/Viterbi decoding is only practical for low rate or
short codes, whereas the symbol-by-symbol decoding rule is only
practical for high rate or short linear codes. We are fairly
well convinced, and the reader may be able to convince himself
by studying the examples in Section 3.2, that the complexity of
the symbol-by-symbol decoding rule for an (n,k) linear code is
comparable to the complexity of a correlation/Viterbi decoder
for the (n,n-k) dual code. This is fairly easy to see in the
case of linear block codes, but not so obvious in the case of
convolutional codes since there are so many options and
programming tricks to be considered. The authors, however, are
firm believers in the coding-complexity Folk Theorem: "The
complexity of any function defined on a linear code is
comparable to the complexity of (essentially) that same function
defined on the dual code". (In fact, it was the unsatisfying
lack of a soft decision decoding method for high rate linear
codes that was "dual" to correlation/Viterbi decoding that
motivated the research reported here.) If our intuition is
correct, then the symbol-by-symbol decoding rule and correlation/
Viterbi decoding should be of comparable complexity for rate 1/2
codes. We remark that the decoding rule for linear codes over
GF(p) can be generalized in a straightforward fashion to linear
codes over GF(p^m) by using the generalized finite Fourier
transform of (46, pg. 367).
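
The duality behind this complexity comparison can be checked
numerically on a small block code. The sketch below is not a
transcription of the Section 3 rule; it assumes a binary code, a
white Gaussian channel with symbols sent as 0 -> +1, 1 -> -1, and
the standard finite Fourier identity that moves a sum over a
linear code onto its dual. It evaluates the bit metric
Pr(c_m = 0 | r) - Pr(c_m = 1 | r), up to a positive factor, once
over the 2^k code words and once over the 2^(n-k) dual code
words. The (7,4) Hamming code and all function names are
illustrative.

```python
import math
from itertools import product

# Generator matrix of a (7,4) Hamming code, used only as a small example.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
N = len(G[0])


def span(rows):
    """All binary linear combinations of the given rows."""
    return [
        tuple(sum(m * row[i] for m, row in zip(msg, rows)) % 2 for i in range(N))
        for msg in product((0, 1), repeat=len(rows))
    ]


def dual_of(code):
    """Brute-force dual code: every word orthogonal (mod 2) to all code words."""
    return [
        v for v in product((0, 1), repeat=N)
        if all(sum(a * b for a, b in zip(v, c)) % 2 == 0 for c in code)
    ]


CODE = span(G)          # 2^k = 16 words
DUAL = dual_of(CODE)    # 2^(n-k) = 8 words


def metric_over_code(r, inv_noise_var, m):
    """Signed sum of code word likelihoods; the sign of bit m selects the term's sign."""
    total = 0.0
    for c in CODE:
        corr = sum(ri * (1 - 2 * ci) for ri, ci in zip(r, c))
        total += (1 - 2 * c[m]) * math.exp(inv_noise_var * corr)
    return total


def metric_over_dual(r, inv_noise_var, m):
    """The same metric, up to a positive factor, summed over the dual code."""
    rho = [math.tanh(inv_noise_var * ri) for ri in r]
    total = 0.0
    for b in DUAL:
        term = 1.0
        for l in range(N):
            if b[l] ^ (1 if l == m else 0):  # exponent: b_l XOR delta(l, m)
                term *= rho[l]
        total += term
    return total


r = (0.9, 1.1, -0.2, 0.8, 1.0, 0.7, 1.2)
# Positive factor relating the two sums: prod(2*cosh(r_l / sigma^2)) / |dual|.
factor = math.prod(2 * math.cosh(1.0 * ri) for ri in r) / len(DUAL)
```

Both functions give the same decision for every bit position,
since they differ only by the positive constant `factor`. The
dual form needs 2^(n-k) terms where the direct form needs 2^k,
which is one way to see why the rule favours high rate codes
while correlation decoding favours low rate codes.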

For very high SNR, the asymptotic expression for bit error
probability derived in Section 3.3 is the same whether optimum
bit-by-bit decoding or maximum-likelihood word decoding
(i.e. correlation) is used. The reason for this is easily seen
intuitively. For γ → ∞, with high probability the only time
a decoding error occurs using either scheme is when the received
word lies very nearly on a straight line between the
transmitted code word and a "nearest neighbor" code word, slightly
closer to the neighbor. In either case, the pattern of errors
coincides with the positions in which the transmitted code word
and the neighbor differ. This result tends to support the
conjecture that correlation decoding and optimum
symbol-by-symbol decoding give, for all practical purposes, the same
performance on discrete memoryless channels.

Finally, we conjecture that the one-step weighted majority
decoders for the (17,9) and (15,5) codes derived in Section 4
to illustrate the two parity set construction methods are in
fact minimal decoders in the sense that they use the fewest
possible parity checks for t-error-correction. It is interesting
to note: (1) that in the (17,9) decoder, parity checks of equal
reliability are assigned different weights, and (2) that it is
apparently necessary to use the complements of consistently
"wrong" parity check sums to obtain a minimal decoder for the
(15,5) code. It is our feeling, however, that these are "quirks"
related to digital t-error-correction, and would probably not
carry over to soft-decision decoding.